    A Developmental Model of Trust in Humanoid Robots

    Trust between humans and artificial systems has recently received increased attention due to the widespread use of autonomous systems in our society. In this context, trust plays a dual role. On the one hand, it is necessary to build robots that are perceived as trustworthy by humans. On the other hand, we need to give those robots the ability to discriminate between reliable and unreliable informants. This thesis focuses on the second problem, presenting an interdisciplinary investigation of trust, in particular a computational model based on neuroscientific and psychological assumptions. First, the use of Bayesian networks for modelling causal relationships was investigated. This approach follows the well-known theory-theory framework of the Theory of Mind (ToM) and an established line of research based on the Bayesian description of mental processes. Next, the role of gaze in human-robot interaction was investigated. The results of this research were used to design a head pose estimation system based on Convolutional Neural Networks. The system can be used in robotic platforms to facilitate joint attention tasks and enhance trust. Finally, these components were integrated into a structured cognitive architecture. The architecture is based on an actor-critic reinforcement learning framework and an intrinsic motivation signal provided by a Bayesian network. To evaluate the model, the architecture was embodied in the iCub humanoid robot and used to replicate a developmental experiment. The model provides a plausible description of children's reasoning that sheds light on the underlying mechanisms involved in trust-based learning. In the last part of the thesis, the contribution of human-robot interaction research is discussed, with the aim of understanding the factors that influence the establishment of trust during joint tasks. Overall, this thesis provides a computational model of trust that takes into account the development of cognitive abilities in children, with a particular emphasis on the ToM and the underlying neural dynamics.
    Funding: THRIVE, Air Force Office of Scientific Research, Award No. FA9550-15-1-002
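
    As a rough illustration of the intrinsic-motivation idea sketched above (not the thesis implementation, which uses a full Bayesian network), the Python fragment below tracks an informant's reliability with a simple Beta-Bernoulli model and turns the change in posterior belief into an intrinsic reward signal; all class and function names are hypothetical.

    # Minimal illustrative sketch, not the thesis code: the thesis uses a Bayesian
    # network, here simplified to a Beta-Bernoulli estimate of an informant's
    # reliability whose posterior change acts as an intrinsic reward.

    class InformantModel:
        def __init__(self, alpha=1.0, beta=1.0):
            # Beta prior over the probability that the informant gives correct information
            self.alpha = alpha
            self.beta = beta

        def update(self, was_correct: bool):
            # Bayesian update after checking the informant's claim against first-hand evidence
            if was_correct:
                self.alpha += 1.0
            else:
                self.beta += 1.0

        def reliability(self) -> float:
            # Posterior mean reliability
            return self.alpha / (self.alpha + self.beta)

    def intrinsic_reward(model: InformantModel, was_correct: bool) -> float:
        # Use the shift in expected reliability as a surprise-like intrinsic signal
        before = model.reliability()
        model.update(was_correct)
        return model.reliability() - before

    if __name__ == "__main__":
        informant = InformantModel()
        for outcome in [True, True, False, True]:
            r = intrinsic_reward(informant, outcome)
            print(f"reliability={informant.reliability():.2f}  intrinsic_reward={r:+.2f}")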

    Emotion Recognition in the Wild using Deep Neural Networks and Bayesian Classifiers

    Group emotion recognition in the wild is a challenging problem, due to the unstructured environments in which everyday life pictures are taken. Some of the obstacles to effective classification are occlusions, variable lighting conditions, and image quality. In this work we present a solution based on a novel combination of deep neural networks and Bayesian classifiers. The neural network works in a bottom-up fashion, analyzing emotions expressed by isolated faces. The Bayesian classifier estimates a global emotion by integrating top-down features obtained through a scene descriptor. To validate the system, we tested the framework on the dataset released for the Emotion Recognition in the Wild Challenge 2017. Our method achieved an accuracy of 64.68% on the test set, significantly outperforming the 53.62% competition baseline.
    Comment: accepted by the Fifth Emotion Recognition in the Wild (EmotiW) Challenge 2017
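
    As a loose illustration of the bottom-up/top-down fusion described in the abstract (not the authors' code), the sketch below combines per-face CNN probabilities with a scene-level prior through Bayes' rule; the example arrays, emotion labels, and function names are placeholders.

    # Minimal illustrative sketch, not the authors' code: fuse bottom-up per-face CNN
    # probabilities with a top-down scene prior via Bayes' rule, treating the faces
    # as conditionally independent evidence.
    import numpy as np

    EMOTIONS = ["negative", "neutral", "positive"]

    def fuse(face_probs: np.ndarray, scene_prior: np.ndarray) -> np.ndarray:
        # face_probs: (n_faces, 3) softmax outputs of the face-level CNN (bottom-up)
        # scene_prior: (3,) distribution from the scene descriptor (top-down)
        log_likelihood = np.sum(np.log(face_probs + 1e-8), axis=0)
        log_posterior = log_likelihood + np.log(scene_prior + 1e-8)
        posterior = np.exp(log_posterior - log_posterior.max())
        return posterior / posterior.sum()

    if __name__ == "__main__":
        faces = np.array([[0.2, 0.3, 0.5],
                          [0.1, 0.2, 0.7]])
        scene = np.array([0.2, 0.3, 0.5])
        print(dict(zip(EMOTIONS, fuse(faces, scene).round(3))))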

    Sim-to-Real Quadrotor Landing via Sequential Deep Q-Networks and Domain Randomization

    The autonomous landing of an Unmanned Aerial Vehicle (UAV) on a marker is one of the most challenging problems in robotics. Many solutions have been proposed, with the best results achieved via customized geometric features and external sensors. This paper discusses for the first time the use of deep reinforcement learning as an end-to-end learning paradigm to find a policy for autonomous UAV landing. Our method is based on a divide-and-conquer paradigm that splits a task into sequential sub-tasks, each one assigned to a Deep Q-Network (DQN), hence the name Sequential Deep Q-Network (SDQN). Each DQN in an SDQN is activated by an internal trigger, and it represents a component of a high-level control policy that can navigate the UAV towards the marker. Several technical solutions were implemented, for example the combination of vanilla and double DQNs and the introduction of a partitioned buffer replay to address the problem of sample efficiency. One of the main contributions of this work is showing how an SDQN trained in a simulator via domain randomization can effectively generalize to real-world scenarios of increasing complexity. The performance of SDQNs is comparable with that of a state-of-the-art algorithm and human pilots, while being quantitatively better in noisy conditions.
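
    To make the sequential structure concrete, the sketch below shows one hypothetical way to chain sub-policies, each standing in for a trained DQN with an internal trigger that hands control to the next stage; it is an assumption-laden illustration, not the paper's implementation, and the sub-task names and observation fields are invented.

    # Minimal illustrative sketch, not the paper's code: each SubPolicy stands in for a
    # trained DQN; its `done` predicate plays the role of the internal trigger that
    # activates the next sub-policy in the sequence.
    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class SubPolicy:
        name: str
        act: Callable[[dict], str]    # maps an observation to an action
        done: Callable[[dict], bool]  # internal trigger: has this sub-task finished?

    def sequential_policy(subpolicies: List[SubPolicy], obs: dict, stage: int):
        # Advance through triggered sub-tasks, then act with the active sub-policy
        while stage < len(subpolicies) - 1 and subpolicies[stage].done(obs):
            stage += 1
        return subpolicies[stage].act(obs), stage

    if __name__ == "__main__":
        align = SubPolicy("align_over_marker",
                          act=lambda o: "move_toward_marker",
                          done=lambda o: abs(o["x_err"]) < 0.1)
        descend = SubPolicy("vertical_descent",
                            act=lambda o: "descend",
                            done=lambda o: o["altitude"] < 0.05)
        action, stage = sequential_policy([align, descend], {"x_err": 0.4, "altitude": 2.0}, 0)
        print(action, stage)  # -> move_toward_marker 0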

    Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation

    In this paper we explore few-shot imitation learning for control problems, which involves learning to imitate a target policy by accessing a limited set of offline rollouts. This setting has been relatively under-explored despite its relevance to robotics and control applications. State-of-the-art methods developed to tackle few-shot imitation rely on meta-learning, which is expensive to train as it requires access to a distribution over tasks (rollouts from many target policies and variations of the base environment). Given this limitation, we investigate an alternative approach, fine-tuning, a family of methods that pretrain on a single dataset and then fine-tune on unseen domain-specific data. Recent work has shown that fine-tuners outperform meta-learners in few-shot image classification tasks, especially when the data is out-of-domain. Here we evaluate to what extent this is true for control problems, proposing a simple yet effective baseline that relies on two stages: (i) training a base policy online via reinforcement learning (e.g. Soft Actor-Critic) on a single base environment, and (ii) fine-tuning the base policy via behavioral cloning on a few offline rollouts of the target policy. Despite its simplicity, this baseline is competitive with meta-learning methods under a variety of conditions and is able to imitate target policies trained on unseen variations of the original environment. Importantly, the proposed approach is practical and easy to implement, as it does not need any complex meta-training protocol. As a further contribution, we release an open source dataset called iMuJoCo (iMitation MuJoCo) consisting of 154 variants of popular OpenAI-Gym MuJoCo environments with associated pretrained target policies and rollouts, which can be used by the community to study few-shot imitation learning and offline reinforcement learning.
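
    As an illustration of stage (ii) of the proposed baseline, the sketch below fine-tunes a pretrained policy network by behavioral cloning on a handful of offline state-action pairs with PyTorch; the architecture, dimensions, and data are placeholders, stage (i) is assumed to have already produced the base policy, and this is not the released implementation.

    # Minimal illustrative sketch, not the released code: stage (ii) of the baseline,
    # fine-tuning a pretrained policy by behavioral cloning on a few offline
    # state-action pairs. Network sizes and data are placeholders.
    import torch
    import torch.nn as nn

    def bc_finetune(policy: nn.Module, states: torch.Tensor, expert_actions: torch.Tensor,
                    epochs: int = 50, lr: float = 1e-4) -> nn.Module:
        optimizer = torch.optim.Adam(policy.parameters(), lr=lr)
        loss_fn = nn.MSELoss()  # regression on continuous actions
        for _ in range(epochs):
            optimizer.zero_grad()
            loss = loss_fn(policy(states), expert_actions)
            loss.backward()
            optimizer.step()
        return policy

    if __name__ == "__main__":
        # Stand-in for a policy pretrained in stage (i), e.g. via Soft Actor-Critic
        base_policy = nn.Sequential(nn.Linear(17, 256), nn.ReLU(), nn.Linear(256, 6), nn.Tanh())
        states = torch.randn(128, 17)               # states from a few target-policy rollouts
        actions = torch.randn(128, 6).clamp(-1, 1)  # corresponding expert actions
        bc_finetune(base_policy, states, actions)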